Global Clustering Quality Coefficient Assessing the Efficiency of PCA Class Assignment
Authors
Abstract
The efficiency of predictive models built with principal component analysis (PCA) depends strongly on the quality of the data clustering revealed by the score plots. The sensitivity and selectivity of the class assignment are strongly influenced by the relative position of the clusters and by their dispersion. We propose a set of indicators inspired by analytical geometry for an objective, quantitative assessment of data clustering quality, together with a global clustering quality coefficient (GCQC) that measures the overall predictive power of PCA models. The use of these indicators for evaluating the efficiency of PCA class assignment is illustrated by a comparative study that identifies the preprocessing function yielding the most efficient PCA system for screening amphetamines based on their GC-FTIR spectra. The GCQC ranking of the tested feature weights is explained using estimated density distributions and validated by quadratic discriminant analysis (QDA).
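The abstract does not define the proposed indicators or the GCQC itself, so the following is only a minimal sketch of the general idea: compute PCA scores, then relate the distance between two cluster centroids to the clusters' dispersion. The function names `pca_scores` and `separation_ratio`, the ratio formula, and the synthetic data are all assumptions made for illustration; they are not the authors' indicators and do not involve real GC-FTIR spectra.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """Project mean-centered data onto its first principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def separation_ratio(scores, labels):
    """Hypothetical indicator (not the GCQC): centroid distance divided
    by the sum of the RMS radii of two clusters in score space."""
    classes = np.unique(labels)
    if classes.size != 2:
        raise ValueError("this sketch handles exactly two classes")
    a = scores[labels == classes[0]]
    b = scores[labels == classes[1]]
    ca, cb = a.mean(axis=0), b.mean(axis=0)
    # RMS distance of each cluster's points from its own centroid.
    ra = np.sqrt(((a - ca) ** 2).sum(axis=1).mean())
    rb = np.sqrt(((b - cb) ** 2).sum(axis=1).mean())
    return np.linalg.norm(ca - cb) / (ra + rb)


# Synthetic stand-in data: two classes of 50 samples with 10 features each.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 10)),
    rng.normal(4.0, 1.0, size=(50, 10)),
])
labels = np.array([0] * 50 + [1] * 50)

scores = pca_scores(X)
ratio = separation_ratio(scores, labels)
print(f"separation ratio: {ratio:.2f}")
```

A larger ratio means the clusters sit farther apart relative to their spread, which is the kind of property the abstract ties to sensitivity and selectivity. Comparing such a ratio across preprocessing functions would mirror the comparative study in spirit only.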
Similar resources
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in the prevention, diagnosis and treatment of diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
Simultaneous Multi-Skilled Worker Assignment and Mixed-Model Two-Sided Assembly Line Balancing
This paper addresses a multi-objective mathematical model for the mixed-model two-sided assembly line balancing and worker assignment with different skills. In this problem, the operation time of each task is dependent on the skill of the worker. The following objective functions are considered in the mathematical model: (1) minimizing the number of mated-stations, (2) minimizing the number of ...
Functional Brain Connectivity Differences Between Different ADHD Presentations: Impaired Functional Segregation in ADHD-Combined Presentation but not in ADHD-Inattentive Presentation
Introduction: Contrary to the Diagnostic and Statistical Manual of Mental Disorders, fifth edition (DSM-5), some studies indicate that ADHD-inattentive presentation (ADHD-I) is a distinct diagnostic disorder and not an ADHD presentation. Methods: In this study, 12 ADHD-combined presentation (ADHD-C), 10 ADHD-I, and 13 controls were enrolled and their resting-state EEG was recorded. Following thi...
Mining Gene Expression Data Using PCA Based Clustering
As the amount of laboratory data in molecular biology and bioinformatics grows exponentially each year due to advanced technologies such as DNA Microarray, new efficient and effective clustering methods must be developed to process this fast-growing amount of biological data. Numerous clustering techniques have been applied in the analysis of gene expression data to extract biologically sign...
Application of international energy efficiency standards for energy auditing in a University buildings
This study seeks to provide insights on understanding the contemporary problems of energy efficiency in Ukrainian universities by developing a comprehensive energy efficiency management framework that encompasses its participating subjects, objects and key drivers along with suggesting its implementation mechanism and tools. Emphasis should be given that the current situation of inefficient and...
Journal title:
Volume 2014, issue
Pages -
Publication date: 2014